Question 4
Intent of Question 


The primary goals of this question were to assess a student’s ability to use boxplots to (1) compare multiple 
sets of data; (2) identify which set of data is most likely to have produced a particular summary value; and 
(3) determine which variable is most useful for classifying a new observation. 


Solution 
Part (a): 


The median value for the percent of chemical Z in the pottery pieces is similar for all three sites, at 
about 7 percent. The ranges for the percent of chemical Z are much different for the three sites, with 
the smallest range of about 2 percent (from 6 percent to 8 percent) at Site II, a range of about 6 percent 
(from about 4 percent to 10 percent) at Site I, and the largest range of about 8 percent (from about 3 
percent to 11 percent) at Site III.
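
The range comparison in part (a) is simple arithmetic on the approximate minimum and maximum values read from the boxplots. The following is a minimal sketch in Python that uses only the approximate values quoted in this solution; the variable names are illustrative.

```python
# Approximate (minimum, maximum) percents of chemical Z read from the
# boxplots, as quoted in the solution to part (a).
chemical_z = {
    "Site I": (4, 10),
    "Site II": (6, 8),
    "Site III": (3, 11),
}

# Range = maximum - minimum for each site.
ranges = {site: hi - lo for site, (lo, hi) in chemical_z.items()}

for site, r in ranges.items():
    print(f"{site}: range of about {r} percent")
```

Because the values are approximations read from a plot, any reasonable reading that preserves the ordering of the ranges (Site II smallest, Site III largest) supports the same conclusion.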


Part (b): 


(i) The piece most likely originated at Site III. Although values outside of the range of data observed in 
the samples would be possible, using the available data results in approximate minimum and 
maximum sums of the percents for the three chemicals as shown in the table below. Site III is the 
only site in which 20.5 falls between the sums of the minimum and maximum values. 
(ii) Chemical Y would be most useful, because the distributions of the percentages of total weight at
the three sites do not overlap. The distributions of chemicals X and Z have substantial overlap.
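
The "do not overlap" criterion can be checked mechanically by comparing the (minimum, maximum) intervals pairwise. The sketch below is one possible way to do this in Python. The chemical Z intervals are the approximate values quoted in part (a); the second set of intervals is hypothetical, chosen only to illustrate three separated distributions, and is not read from the boxplots for chemical Y.

```python
from itertools import combinations


def any_overlap(intervals):
    """Return True if any two closed (min, max) intervals share a value."""
    return any(
        lo_a <= hi_b and lo_b <= hi_a
        for (lo_a, hi_a), (lo_b, hi_b) in combinations(intervals, 2)
    )


# Chemical Z: approximate (min, max) percents quoted in part (a).
chemical_z = [(4, 10), (6, 8), (3, 11)]

# HYPOTHETICAL intervals for illustration only (not taken from the boxplots):
# one site entirely high, one entirely low, one in the middle.
hypothetical_separated = [(11, 13), (4, 6), (7, 9)]

print(any_overlap(chemical_z))
print(any_overlap(hypothetical_separated))
```

A chemical whose site intervals never overlap lets a single measurement point to exactly one site, which is why such a chemical is the most useful for classification.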


Scoring 
This question is scored in three sections. Section 1 consists of part (a), section 2 consists of part (b-i), and 
section 3 consists of part (b-ii). Each section is scored as essentially correct (E), partially correct (P), or 
incorrect (I). 
Section 1 is scored as follows: 
Essentially correct (E) if the response includes the following three components: 
1. Recognition that the medians or centers are almost the same for the three sites 
2. Recognition that the variability (ranges, IQRs, spread) is different across the three sites
3. Context is included 


Partially correct (P) if the response includes only two of the three components. 


Incorrect (I) if the response includes at most one of the three components. 

Notes: 

• In all sections, comments about shape should be ignored because complete shape information
is not obtainable from boxplots.

• Responses are not required to give numerical values. If responses provide numerical values,
any reasonable approximation from the boxplots is acceptable.

• Because the boxplots are all symmetric, it is acceptable if the response discusses means
instead of medians.

• Any discussion of chemical X and chemical Y is considered extraneous.

• Context is satisfied by any of the following references: site, chemical, weight, total weight.


Section 2 is scored as follows: 


Essentially correct (E) if the response includes the following three components: 
1. Site III is chosen.
2. Sums of the minimum and maximum are computed for the three chemicals at each site.
3. A reasonable numerical justification is given involving sums of a statistical measure across the
three chemicals to choose Site III.


Partially correct (P) if the response includes only two of the three components. 
Incorrect (I) if the response includes at most one of the three components. 


Notes: 

• If the response computes only the sum of the minimums for Site I and the sum of the
maximums for Site II and recognizes that this is sufficient, the response is scored E.

• If an alternative measure is used that involves sums of the three chemicals, such as the sum of
the medians or the sums of the first quartiles and sums of the third quartiles, instead of the
minimum and maximum sums, the second component is not satisfied, but the third component
might be satisfied.

o If the response explicitly or implicitly compares the alternate sum to the other two sites
(for example, by indicating that the sum is the closest to 20.5 percent or by listing the
sums for all three sites), the response is scored P.

o If the response does not have an implicit or explicit comparison, the response is
scored I.

• If either Site I or Site II is identified as the correct choice, no matter how that choice is justified,
the response is scored I.

• The approximate sums of the medians are 27.5 for Site I, 16 for Site II, and 20 for Site III.
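
The alternative justification based on the sums of the medians amounts to finding the site whose sum is closest to the observed 20.5 percent. The sketch below illustrates that comparison in Python using the approximate median sums stated in the note above; the variable names are illustrative. Under the rubric, this approach does not satisfy component 2, so a response relying on it alone is scored at most P.

```python
# Observed sum of the percents of chemicals X, Y, and Z for the new piece.
target = 20.5

# Approximate sums of the medians for each site, as given in the notes.
median_sums = {"Site I": 27.5, "Site II": 16, "Site III": 20}

# Absolute distance between each site's median sum and the observed sum.
distances = {site: abs(total - target) for site, total in median_sums.items()}

# The site with the smallest distance is the closest match.
closest = min(distances, key=distances.get)
print(closest, distances[closest])
```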


Section 3 is scored as follows: 


Essentially correct (E) if the response chooses chemical Y AND gives a reasonable justification based 
on the fact that the distributions of chemical Y are distinctive across sites. 


Partially correct (P) if the response chooses chemical Y AND provides justification based on the
boxplots, but does not clearly explain that the distributions of chemical Y are distinctive across sites;
OR
if the response correctly discusses that the distributions of chemical Y are distinctive across sites, but
never explicitly chooses chemical Y as the best choice, for instance, by stating only that there is
substantial overlap across sites for chemicals X and Z but no overlap for chemical Y.

Incorrect (I) if the response does not meet the criteria for E or P. 


Notes: 


• To justify that the distributions of chemical Y are distinctive across sites, the justification must
address both location and variability of the boxplots; for example, by stating that the boxplots
do not overlap for chemical Y.

• If the response chooses chemical X or Z OR chooses chemical Y with no reasonable
justification, the response is scored I.

• The justification that the distributions of chemical Y are distinctive across sites:

o The following are acceptable because both location and variability are addressed. Such
responses are scored E.

    ▪ The boxplots for chemical Y do not overlap, or the boxplots for chemicals X and
      Z overlap.
    ▪ All values of Site I are high, all values of Site II are low, and all values of Site III
      are in the middle.
    ▪ The ranges never intersect.
    ▪ The boxplots share no data.
    ▪ Has completely different percentages at each site.

o The following are incomplete justifications and are scored P.

    ▪ The boxplots vary.
    ▪ Chemical Y varies the most.
    ▪ Chemical Y has the greatest variation.
    ▪ The variation between/among sites is the largest.
    ▪ The boxplots are different.
    ▪ The medians/means differ.
    ▪ The medians/means are most variable.
    ▪ There is a difference in the percentages of chemical Y for each site.
    ▪ The distribution of percents differs the most among the sites.
4 Complete Response

Three sections essentially correct

3 Substantial Response

Two sections essentially correct and one section partially correct

2 Developing Response

Two sections essentially correct and no sections partially correct
OR
One section essentially correct and one or two sections partially correct
OR
Three sections partially correct

1 Minimal Response

One section essentially correct
OR
No sections essentially correct and two sections partially correct
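
The mapping from three section scores to an overall score can be written as a small lookup. The sketch below is one possible encoding in Python, not an official scoring tool. It assumes the numeric scores 4, 3, 2, and 1 for Complete, Substantial, Developing, and Minimal responses (consistent with the scores reported for the sample responses) and assumes that any combination not listed in the guidelines scores 0; the function name `overall_score` is hypothetical.

```python
def overall_score(sections):
    """Map three section scores ('E', 'P', or 'I') to an overall score."""
    if len(sections) != 3 or any(s not in "EPI" for s in sections):
        raise ValueError("expected exactly three scores, each 'E', 'P', or 'I'")

    e = sections.count("E")  # sections scored essentially correct
    p = sections.count("P")  # sections scored partially correct

    if e == 3:
        return 4  # Complete Response
    if e == 2 and p == 1:
        return 3  # Substantial Response
    if (e == 2 and p == 0) or (e == 1 and p >= 1) or p == 3:
        return 2  # Developing Response
    if e == 1 or (e == 0 and p == 2):
        return 1  # Minimal Response
    # Assumption: combinations not listed in the guidelines score 0.
    return 0


print(overall_score(["E", "E", "E"]))
print(overall_score(["E", "P", "E"]))
print(overall_score(["E", "P", "P"]))
```

The three calls correspond to the section scores described in the commentary for samples 4A, 4B, and 4C.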
4. The chemicals in clay used to make pottery can differ depending on the geographical region where the clay
originated. Sometimes, archaeologists use a chemical analysis of clay to help identify where a piece of pottery
originated. Such an analysis measures the amount of a chemical in the clay as a percent of the total weight of the
piece of pottery. The boxplots below summarize analyses done for three chemicals (X, Y, and Z) on pieces of
pottery that originated at one of three sites: I, II, or III.

(a) For chemical Z, describe how the percents found in the pieces of pottery are similar and how they differ
among the three sites.

(b) Consider a piece of pottery known to have originated at one of the three sites, but the actual site is not
known.

(i) Suppose an analysis of the clay reveals that the sum of the percents of the three chemicals X, Y, and Z
is 20.5%. Based on the boxplots, which site (I, II, or III) is the most likely site where the piece of
pottery originated? Justify your choice.

(ii) Suppose only one chemical could be analyzed in the piece of pottery. Which chemical (X, Y, or Z)
would be the most useful in identifying the site where the piece of pottery originated? Justify your
choice.

Question 4
Overview 


The primary goals of this question were to assess a student’s ability to use boxplots to (1) compare multiple 
sets of data; (2) identify which set of data is most likely to have produced a particular summary value; and 
(3) determine which variable is most useful for classifying a new observation. 


Sample: 4A 
Score: 4 


In section 1 the response clearly displays the intended similarities and differences by labeling each. The
response states that the median for chemical Z for all three sites is about 7 percent and satisfies both
components 1 and 3. The response adds that the interquartile ranges of chemical Z for sites I and III are
similar. The statement is true, but the question asked for a similarity among the three sites (not two). Finding
a similarity for just sites I and III would not have been sufficient to satisfy component 1. The response then
lists the three different ranges for chemical Z across sites. Simply listing the ranges would not have satisfied
component 2 in itself; however, the response clearly labels the listing as a difference, so component 2 is
satisfied. Because the three components are satisfied, section 1 was scored as essentially correct.

In section 2 site III is correctly identified as the most likely site of origin of the piece of pottery, satisfying
component 1. The response clearly provides the sum of the minimums and the sum of the maximums of the
three chemicals for each site. The calculations that lead to the sums are not required, but the numerical sums
must be stated. Component 2 is satisfied. The response justifies the choice of site III with an explanation that
site III is the most likely site because 20.5 percent falls within the interval of 14 percent, the minimum sum, to
26 percent, the maximum sum. Component 3 is satisfied. Because three components are satisfied, section 2
was scored as essentially correct.

In section 3 chemical Y is correctly chosen as the chemical most useful in
identifying the site of origin of the piece of pottery, and the choice has a complete justification. Complete
justification is provided because the response states the intervals of the minimums and maximums for each
site and then states that these intervals do not overlap. The response refers to these intervals as “ranges,”
which is an incorrect usage of a statistical term. However, the reference is ignored because range is not in
the intent of the question. The response also notes that the corresponding intervals for chemicals X and Z do
overlap. The justification is considered complete and has clear communication. Because the correct chemical
is chosen with complete justification, section 3 was scored as essentially correct. Because three sections
were scored as essentially correct, the response earned a score of 4.


Sample: 4B 
Score: 3 


In section 1 the response states that the median of chemical Z is similar across all three sites, thus satisfying
components 1 and 3. The response continues to explain that there are two differences: the interquartile
ranges differ across sites, and the ranges differ across sites. Only one difference about variability across sites
is required and either satisfies component 2. Because all three components are satisfied, section 1 was
scored as essentially correct.

In section 2 the response satisfies component 1 by correctly choosing site III as
the most likely site of origin of the piece of pottery. Component 2 is not satisfied because the sum of the
minimums and the sum of the maximums of the three chemicals for each site are not calculated. An
alternative measure involving all three chemicals for each site is used, as the response calculates the sum of
the medians for each chemical across the three sites. To justify the choice of site III, the response states that
the sum of the medians for site III is closest to 20.5 percent, compared with the sums from sites I and II.
Because an alternative measure is used and compared to sites I and II, component 3 is satisfied. Two of the
three components in section 2 are satisfied, so section 2 was scored as partially correct.

In section 3 chemical Y is chosen as the most useful chemical to identify the site of origin of the piece of
pottery and justification based on the boxplots is provided. For each site, the interval from the minimum to
the maximum for chemical Y is provided. The response notes that the intervals are different for chemical Y
across the three sites. The response also notes that the corresponding intervals for chemicals X and Z overlap
across the sites. The justification is complete and section 3 was scored as essentially correct. Because two
sections were scored as essentially correct, and one section was scored as partially correct, the response
earned a score of 3.


Sample: 4C 
Score: 2 


In section 1 the response states that the distribution for chemical Z across all three sites is approximately
normal with similar medians. Because the response states, in context, that the medians are similar,
components 1 and 3 are satisfied. The comment that the distributions of chemical Z are approximately
normal across sites is ignored. Complete information about shape cannot be determined from a boxplot. A
symmetric boxplot could come from a normal distribution, a uniform distribution, or a bimodal distribution,
for example. In the case of chemical Z the shape of the distribution is unknown. Component 2 is satisfied
because the response states that site III has the largest range and site II has the smallest range. Finding a
difference between two sites is acceptable for stating a difference across three sites. Because three
components are satisfied, section 1 was scored as essentially correct.

In section 2 the response correctly
identifies site III as the most likely site of origin of the piece of pottery and component 1 is satisfied. The
sums of the minimums and the sums of the maximums of the three chemicals for each site are not
calculated, and component 2 is not satisfied. An alternative measure, the sums of the medians, is calculated
for all three sites. The response compares the sum of the medians for site III to sites I and II and states that
20.5 percent is closest to the sum of the medians for site III. The response uses a reasonable numerical
justification that involves the sum of a statistical measure from three chemicals across three sites, so
component 3 is satisfied. Because two of the three components for section 2 are satisfied, section 2 was
scored as partially correct.

In section 3 chemical Y is identified as the most useful chemical in identifying the
site of origin of the piece of pottery, but the choice has incomplete justification. The response states that the
“percents for chemical Y vary the most among the three sites.” Stating that the percents are the most
variable does not ensure that the three boxplots do not overlap; that is, the percents could vary and still
overlap. The response also refers to chemical Y as having the “extremes of the range of percent of total
weight in pottery across all three sites.” Range is a single number, but it appears as though this response
considers range to be a span of numbers. Boxplots with large ranges could still overlap or not overlap, so it is
not clear from the response that there is an understanding that chemical Y is the best choice because the
boxplots do not overlap. Because the justification is incomplete for section 3, this section was scored as
partially correct. Because one section was scored as essentially correct, and two sections were scored as
partially correct, the response earned a score of 2.